Training data generating method for image processing, image processing method, and devices thereof

ABSTRACT

An image processing method and an image processing device detect an object from a driving image of a vehicle, obtain information on an altitude difference between the vehicle and the object, and input image domain coordinates of the object in the driving image and the information on the altitude difference to a neural network to determine world domain coordinates of the object.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority from Korean Patent Application No. 10-2018-0109024, filed on Sep. 12, 2018, in the Korean Intellectual Property Office, the disclosure of which is herein incorporated by reference in its entirety.

BACKGROUND

Embodiments of the disclosure relate to a training data generating method for image processing, an image processing method, and devices thereof.

Recognition and detection of objects for automatic driving may be performed through a driving image of a vehicle. At this time, a non-linear transformation by a homography operation may be used for reconstructing two-dimensional (2D) image domain coordinates into three-dimensional (3D) world domain coordinates.

However, the conversion of the 2D image domain coordinates into the 3D world domain coordinates by the homography operation may be incorrect, and a large error may occur when a lane and an object are detected and a position of a vehicle is estimated. Such an error causes instability when the vehicle is driven. In particular, correctness may remarkably deteriorate in a ramp section in which the altitude of a road varies.

SUMMARY

According to an aspect of an embodiment of the disclosure, there is provided an image processing method including detecting an object within a driving image of a vehicle, obtaining an altitude difference between the vehicle and the object, determining world domain coordinates of the object by a neural network processing of image domain coordinates of the object in the driving image and the altitude difference, and controlling the vehicle on a road with respect to the object based on the world domain coordinates of the object.

The altitude difference may include pitch information of the vehicle and vanishing line information in the driving image.

The image processing method may further include tracking the image domain coordinates of the object with the lapse of time, and filtering the tracked image domain coordinates of the object to convert a type of the image domain coordinates of the object into a floating point.

The image processing method may further include performing scaling normalization on the image domain coordinates of the object in the driving image based on the vanishing line information in the driving image.

The object may include a dynamic object with mobility and a still object without mobility, and the neural network may include at least one of a first neural network for estimating world domain coordinates of the dynamic object and a second neural network for estimating world domain coordinates of the still object.

When the object is the dynamic object with mobility, the image processing method may further include generating a live map corresponding to the dynamic object by using a result of converting image domain coordinates of the dynamic object into the world domain coordinates and generating a driving parameter of the vehicle for controlling the vehicle on the road with respect to the dynamic object by using the live map.

When the object is the still object without mobility, the image processing method may further include generating a landmark map corresponding to the still object by using a result of converting image domain coordinates of the still object into the world domain coordinates and determining at least one of a position and a route of the vehicle for controlling the vehicle on the road with respect to the still object by using the landmark map.

The image processing method may further include outputting world domain coordinates of the object.

The image processing method may further include obtaining the driving image captured by a camera mounted in the vehicle.

According to an aspect of an embodiment of the disclosure, there is provided a training data generating method including obtaining image domain coordinates of dynamic objects by tracking the dynamic objects within a driving image, converting image domain coordinates of a first dynamic object among the dynamic objects into first world domain coordinates of the first dynamic object, wherein the first dynamic object is positioned within a predetermined matching distance from a vehicle, obtaining second world domain coordinates of peripheral objects by tracking the peripheral objects by using a distance sensor, matching one of the peripheral objects with the first dynamic object by comparing the first world domain coordinates with the second world domain coordinates, and generating training data including the image domain coordinates of the first dynamic object and the second world domain coordinates of the matched peripheral object.

The converting of the image domain coordinates of the first dynamic object into the first world domain coordinates may include converting initial image domain coordinates of the first dynamic object into the first world domain coordinates by a homography operation.

The training data generating method further includes associating a first identifier (ID) with the first dynamic object and associating second IDs with the peripheral objects. The matching of one of the peripheral objects with the first dynamic object may include matching a second ID among the second IDs associated with one of the peripheral objects with the first ID associated with the first dynamic object.

The dynamic objects may include at least one of peripheral vehicles, pedestrians, and animals.

The training data generating method may further include tracking the image domain coordinates of the dynamic objects over the lapse of time and converting a type of the image domain coordinates of the dynamic objects into a floating point by filtering the tracked image domain coordinates of the dynamic objects.

The training data generating method may further include performing scaling normalization on the image domain coordinates of the dynamic objects in the driving image based on vanishing line information in the driving image.

According to an aspect of an embodiment of the disclosure, there is provided a training data generating method including storing image domain coordinates of a still object by tracking the still object from a driving image including a plurality of frames over the lapse of time, converting image domain coordinates of a current frame among the image domain coordinates into first global world domain coordinates based on global positioning system (GPS) information, obtaining second global world domain coordinates of peripheral objects based on an output of a distance sensor and the GPS information, matching one of the peripheral objects with the still object by comparing the first global world domain coordinates with the second global world domain coordinates, and generating a plurality of training data, wherein each training data of the plurality of training data includes one of the stored image domain coordinates and the second global world domain coordinates of the matched peripheral object.

The training data generating method further includes providing a first ID to the still object and providing second IDs to the peripheral objects. The matching one of the peripheral objects with the still object may include matching a second ID provided to one of the peripheral objects with the first ID provided to the still object.

The still object may include at least one of buildings, signs, traffic lights, a crosswalk, a stop line, and a driving line included in the driving image.

The training data generating method may further include tracking the image domain coordinates of the still object over the lapse of time and converting a type of the image domain coordinates of the still object into a floating point by filtering the tracked image domain coordinates of the still object.

The training data generating method may further include performing scaling normalization on the image domain coordinates of the still object in the driving image based on vanishing line information in the driving image.

The training data generating method may further include accumulatively storing the output of the distance sensor and the GPS information.

According to an aspect of an embodiment of the disclosure, there is provided an image processing device including a processor for detecting an object within a driving image of a vehicle, obtaining an altitude difference between the vehicle and the object, determining world domain coordinates of the object by a neural network processing of image domain coordinates of the object in the driving image and the altitude difference, and controlling the vehicle on a road with respect to the object based on the world domain coordinates of the object.

The altitude difference may include pitch information of the vehicle and vanishing line information in the driving image.

The processor may track the image domain coordinates of the object with the lapse of time and filter the tracked image domain coordinates of the object to convert a type of the image domain coordinates of the object into a floating point.

The processor may perform scaling normalization on the image domain coordinates of the object in the driving image based on vanishing line information in the driving image.

The object may include a dynamic object with mobility and a still object without mobility, and the neural network may include at least one of a first neural network for estimating world domain coordinates of the dynamic object and a second neural network for estimating world domain coordinates of the still object.

When the object is the dynamic object with mobility, the processor may generate a live map corresponding to the dynamic object by using a result of converting image domain coordinates of the dynamic object into the world domain coordinates, and may generate a driving parameter of the vehicle for controlling the vehicle on the road with respect to the dynamic object by using the live map.

When the object is the still object without mobility, the processor may generate a landmark map corresponding to the still object by using a result of converting image domain coordinates of the still object into the world domain coordinates, and may determine at least one of a position and a route of the vehicle for controlling the vehicle on the road with respect to the still object by using the landmark map.

The processor may output world domain coordinates of the object to correspond to the object.

The image processing device may further include a camera mounted in the vehicle to capture the driving image.

According to an aspect of an embodiment of the disclosure, there is provided a training data generating device including a processor for obtaining image domain coordinates of dynamic objects by tracking the dynamic objects within a driving image, converting image domain coordinates of a first dynamic object positioned within a predetermined matching distance from a vehicle, from among the dynamic objects, into first world domain coordinates of the first dynamic object, obtaining second world domain coordinates of peripheral objects by tracking the peripheral objects by using a distance sensor, matching one of the peripheral objects with the first dynamic object by comparing the first world domain coordinates with the second world domain coordinates, and generating training data including the image domain coordinates of the first dynamic object and second world domain coordinates of the matched peripheral object.

According to an aspect of an embodiment of the disclosure, there is provided a training data generating device including a processor for storing image domain coordinates of a still object by tracking the still object from a driving image including a plurality of frames over the lapse of time, converting image domain coordinates of a current frame among the image domain coordinates into first global world domain coordinates based on global positioning system (GPS) information, obtaining second global world domain coordinates of peripheral objects based on an output of a distance sensor and the GPS information, matching one of the peripheral objects with the still object by comparing the first global world domain coordinates with the second global world domain coordinates, and generating a plurality of training data, wherein each training data of the plurality of training data includes one of the stored image domain coordinates and the second global world domain coordinates of the matched peripheral object.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the disclosure will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1 is a view illustrating a method of reconstructing two-dimensional (2D) image domain coordinates of a driving image to three-dimensional (3D) world domain coordinates, according to an embodiment of the disclosure;

FIG. 2 is a flowchart illustrating an image processing method according to an embodiment of the disclosure;

FIG. 3A is a view illustrating information on a difference in altitude according to an embodiment of the disclosure;

FIG. 3B is a view illustrating a method of obtaining information on a vanishing line, according to an embodiment of the disclosure;

FIG. 4 is a view illustrating a configuration of an image processing device according to an embodiment of the disclosure;

FIG. 5 is a view illustrating a structure of a neural network according to an embodiment of the disclosure;

FIG. 6A is a view illustrating a driving image including a vanishing line;

FIG. 6B is a view illustrating scaling normalization of the X-axis according to an embodiment of the disclosure;

FIG. 7A is a view illustrating another driving image including a vanishing line;

FIG. 7B is a view illustrating scaling normalization of the Y-axis according to an embodiment of the disclosure;

FIG. 8 is a view illustrating a conversion of a floating point according to an embodiment of the disclosure;

FIG. 9 is a flowchart illustrating an image processing method according to an embodiment of the disclosure;

FIG. 10 is a flowchart illustrating a method of generating training data based on coordinates of dynamic objects in a driving image, according to an embodiment of the disclosure;

FIG. 11 is a configuration diagram of a training data generating device for dynamic objects according to an embodiment of the disclosure;

FIG. 12A is a view illustrating driving images captured by a vehicle moving on a road;

FIG. 12B is a view illustrating a method of generating training data for a dynamic object in a driving image, according to an embodiment of the disclosure;

FIG. 13 is a view illustrating a method of accumulatively generating training data, according to an embodiment of the disclosure;

FIG. 14 is a flowchart illustrating a method of generating training data based on coordinates of a still object in a driving image, according to an embodiment of the disclosure;

FIG. 15 is a configuration diagram of a training data generating device for still objects according to an embodiment of the disclosure;

FIG. 16 is a view illustrating a method of generating training data on a still object in a driving image, according to an embodiment of the disclosure; and

FIG. 17 is a block diagram of an image processing device according to an embodiment of the disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

Specific structural or functional descriptions disclosed in the current specification are provided in order to describe embodiments in accordance with a descriptive concept. The subject matter of the disclosure may be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein.

While such terms as “first,” “second,” etc., may be used to describe various components, such components are not limited to the above terms. The above terms are used only to distinguish one component from another. For example, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

When a certain component is referred to as being “connected” to another component, the component may be directly connected to the other component. However, it should be understood that a different component may intervene between them.

Singular expressions, unless defined otherwise in contexts, include plural expressions. The terms “comprises” or “may comprise” used herein in various example embodiments may indicate the presence of a corresponding function, operation, or component and do not limit one or more additional functions, operations, or components. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, may be used to specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

When a certain embodiment may be implemented differently, a specific process order may be performed differently from the described order. For example, two consecutively described processes may be performed substantially at the same time or performed in an order opposite to the described order. Variations from the shapes of the illustrations as a result, for example, of manufacturing techniques and/or tolerances, are to be expected. Thus, embodiments of the disclosure should not be construed as limited to the particular shapes of regions illustrated herein, but are to include deviations in shapes that result, for example, from manufacturing.

The embodiments to be described hereinafter may be used for displaying a lane in an augmented reality navigation system such as a smart vehicle or generating visual information for helping to steer an autonomous vehicle. In addition, the embodiments may be used for interpreting visual information and helping stable and comfortable driving in a device including an intelligence system such as a head up display (HUD) provided for driving assistance or complete autonomous driving in a vehicle. The embodiments may be used for an autonomous vehicle, a smart vehicle, a smart phone, and a mobile device. Hereinafter, the embodiments will be described in detail with reference to the accompanying drawings. The same reference numeral denotes the same member.

Hereinafter, ‘a road’ may be an expressway, a national highway, a local road, or a national expressway on which vehicles are driven. The road may include one or a plurality of lanes. ‘A driving lane’ may correspond to a lane used by a driving vehicle among a plurality of lanes. ‘Lanes’ may be distinguished from each other by lane marking displayed on a road surface. A lane may be defined by lane marking on the right and left of the road. ‘The road marking’ on the road surface on which the vehicle is driven may include lane marking such as a centerline or a taxiway line, a symbol such as a lane change line, a no left-turn, a progress direction guide line, or a crosswalk, or non-lane marking such as characters, for example, a children protection zone or slow down.

FIG. 1 is a view illustrating a method of reconstructing two-dimensional (2D) image domain coordinates of a driving image to three-dimensional (3D) world domain coordinates according to an embodiment of the disclosure.

Referring to FIG. 1, a 2D driving image 110 of a vehicle and a 3D image 130 corresponding to the 2D driving image 110 are illustrated. The 3D image 130 may be a top view image in the world domain and may include depth information.

A detection system of the vehicle detects vehicles, people, traffic lights, signs, lanes, and road conditions. The vehicle avoids collision by detecting peripheral vehicles, driving crossroads, and road markings including the lanes, and may perform route search and perpendicular and horizontal direction control by identifying and/or detecting the signs and the traffic lights. Hereinafter, ‘the vehicle’ may include an automatic driving function and/or an advanced driver assistance (ADA) function.

The 2D driving image 110 may be captured by a capturing device during road driving. The capturing device may be mounted in the front of the vehicle, the side of the vehicle, the top of the vehicle, the bottom of the vehicle, the rear of the vehicle, or any one or combination of all of the above. The 2D driving image 110 may include various peripheral objects such as peripheral vehicles 111, a lane 113, street lamps 115, and a crosswalk 117.

An image processing device according to an embodiment may convert image domain coordinates of peripheral objects detected by analyzing the 2D driving image 110 to 3D world domain coordinates. The image processing device may employ a neural network to convert the image domain coordinates of peripheral objects detected by analyzing the 2D driving image 110 to 3D world domain coordinates. Peripheral vehicles 131, a lane 133, street lamps 135, and a crosswalk 137 may be displayed in the 3D image 130, similar to the indication thereof in the 2D driving image 110.

The image processing device may control a vehicle to drive while maintaining an inter-vehicle distance by converting dynamic objects detected from an image domain of the 2D driving image 110 into a 3D world domain. In addition, the image processing device may control the vehicle to drive while maintaining a lane and to generate a route by converting still objects detected from the 2D image domain to the 3D world domain and estimating a position of the vehicle.

In addition, the image processing device according to an embodiment may reduce the load of the neural network by tracking and detecting partial objects (for example, vehicles, a road, signs, etc.) as targets without processing the entire 2D driving image 110.

Hereinafter, for convenience, ‘the 2D image domain (coordinates)’ will be simply represented as ‘image domain (coordinates)’ and ‘the 3D world domain (coordinates)’ will be simply represented as ‘world domain (coordinates).’

FIG. 2 is a flowchart illustrating an image processing method according to an embodiment of the disclosure.

Referring to FIG. 2, an image processing device according to an embodiment detects objects from a driving image of a vehicle in operation S210. The driving image may be obtained by the capturing device mounted in the vehicle during the driving of the vehicle to capture a front view, side views, and other views from the perspective of the vehicle. Alternatively, the driving image may be at least one external image among the front view and the side views of the vehicle, which is received from an external source, such as a traffic camera, through a communication interface (refer to a communication interface 1770 of FIG. 17).

The driving image may include a road image including peripheral vehicles, a lane, a curb, a sidewalk, and a peripheral environment and/or a road surface image like the 2D driving image 110 illustrated in FIG. 1. The driving image may include various images such as an infrared image, a depth image, and a stereo image other than a color image. The driving image may include a frame, a plurality of frames, or a video.

Objects detected by the image processing device may be other vehicles, a road vanishing point, a road marking, pedestrians, traffic lights, signs, people, animals, plants, and buildings. However, an embodiment of the disclosure is not limited thereto. The objects may include dynamic objects with mobility and/or still objects without mobility. The dynamic objects may include various objects with mobility such as peripheral vehicles, pedestrians, and animals. The still objects may include various objects without mobility such as various lines such as a crosswalk, a stop line, and a driving line, a road marking, a road curb, buildings, signs, plants (trees), lights, and traffic lights.

In operation S210, the image processing device according to an embodiment may detect the objects from the driving image by using a convolutional neural network (CNN) previously trained to recognize the objects. In the CNN, for example, bounding boxes of the lane display and the non-lane display to be detected from the driving image, and the kinds of the lane display and the non-lane display to be detected from the driving image, may be previously trained.
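
For illustration only, such a detection step might be sketched in Python as follows; torchvision's pretrained Faster R-CNN is used here as an assumed stand-in for the previously trained CNN of the disclosure, and the input tensor is a placeholder:

```python
import torch
import torchvision

# Assumed stand-in detector; the disclosure's own CNN and its training are not
# specified, so a pretrained torchvision model is used purely as an example.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

frame = torch.rand(3, 1200, 1920)          # placeholder driving image in [0, 1]
with torch.no_grad():
    det = model([frame])[0]                # dict with boxes, labels, scores
boxes = det["boxes"][det["scores"] > 0.5]  # keep confident object detections
```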

The image processing device obtains information on an altitude difference between a vehicle and an object in operation S220. The information on the altitude difference may include, for example, information on a pitch of the vehicle and information on a vanishing line in the driving image. The information on the altitude difference between the vehicle and the object will be described in detail with reference to FIGS. 3A and 3B.

The image processing device inputs image domain coordinates of the object in the driving image and the information on the altitude difference to the neural network and determines world domain coordinates of the object in operation S230. The neural network may determine the world domain coordinates of the objects including both the dynamic objects and the still objects. Alternatively, the neural network may include at least one of a first neural network for determining the world domain coordinates of the dynamic objects and a second neural network for determining the world domain coordinates of the still objects. A configuration and operation of the image processing device according to an embodiment will be described in detail with reference to FIGS. 4 and 5.

The image processing device may output the world domain coordinates determined in operation S230 to correspond to the objects. The image processing device may explicitly or implicitly output the world domain coordinates of the objects to correspond to the objects. ‘Explicitly outputting the world domain coordinates of the objects’ may include, for example, displaying the world domain coordinates of the objects on a screen (or a map) to correspond to the objects and/or outputting the world domain coordinates of the objects as audio. Alternatively, ‘implicitly outputting the world domain coordinates of the objects’ may include, for example, controlling a vehicle by using the world domain coordinates of the objects, determining a position of the vehicle, or setting or changing a route.

According to an embodiment, the image processing device may perform scaling normalization on the image domain coordinates of the objects in the driving image based on the information on the vanishing line in the driving image. A method, performed by the image processing device, of performing the scaling normalization will be described in detail with reference to FIGS. 6 and 7.

Alternatively, according to an embodiment, the image processing device may track the image domain coordinates of the objects over the lapse of time. The image processing device may convert a type of the image domain coordinates of the objects into a floating point by filtering the image domain coordinates of the objects, which are tracked over the lapse of time. A method, performed by the image processing device, of converting the type of the image domain coordinates into the floating point will be described in detail with reference to FIG. 8.

According to an embodiment, the image processing device determines whether the objects detected in operation S210 are the dynamic objects with mobility or the still objects without mobility, and may perform different operations in accordance with the determination result. An embodiment in which the image processing device distinguishes the dynamic objects from the still objects and detects the dynamic objects and the still objects will be described in detail with reference to FIG. 9.

FIG. 3A is a view illustrating information on a difference in altitude according to an embodiment of the disclosure.

Referring to FIG. 3A, vehicle pitch information 310 and vanishing line information 320 are illustrated.

The vehicle pitch information 310 may correspond to information representing a slope or altitude of a vehicle with respect to the ground. The vehicle pitch information 310 may be measured by, for example, an inertial measurement unit (IMU) sensor or a gyro sensor. The vehicle pitch information 310 may be represented as ‘p’.

The vanishing line information 320 may represent an altitude of a vanishing line of objects ahead in the driving image, that is, an altitude of a vanishing point at which the objects ahead in the driving image converge. The vanishing line information 320 may include a position (for example, a y-coordinate of the vanishing point) of the vanishing point in the driving image. The vanishing line information 320 may be obtained from the driving image captured by the capturing device such as a camera.

The image processing device may obtain the vanishing line information 320 by performing image recognition on the driving image. The vanishing line information 320 may be represented as ‘vl’.

FIG. 3B is a view illustrating a method of obtaining information on a vanishing line, according to an embodiment of the disclosure.

According to an embodiment, the image processing device may determine the vanishing line based on the highest point of a drivable road. For example, referring to FIG. 3B, the image processing device extracts regions 341, 342, and 343 of the drivable road based on image processing such as deep learning and may determine the y-coordinate of a vanishing line 330 based on the highest point of the extracted regions of the drivable road. When the y-coordinate increases from an upper end of the image toward a lower end of the image, the highest point of the extracted regions may be a pixel with the smallest y-coordinate among pixels included in the extracted regions. In addition, the regions of the drivable road may be extracted based on a neighboring vehicle.
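
As a minimal sketch, assuming a drivable-road mask has already been produced by a segmentation step (the mask itself is a hypothetical input), the y-coordinate of the vanishing line could be read off as follows:

```python
import numpy as np

def vanishing_line_y(road_mask: np.ndarray) -> int:
    """road_mask: HxW boolean array, True where the road is drivable."""
    ys, _ = np.nonzero(road_mask)
    # The y-coordinate grows from the top of the image toward the bottom, so
    # the highest road point is the pixel with the smallest y-coordinate.
    return int(ys.min())
```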

According to an embodiment, the image processing device extracts a plurality of lanes and may determine a point at which extended lines of the plurality of lanes meet as the vanishing point. For example, referring to FIG. 3B, the image processing device extracts two lanes 351 and 352 based on the image processing such as deep learning, extends the two lanes 351 and 352, and may determine a vanishing point 335. The image processing device may determine a y-coordinate of the determined vanishing point 335 as a y-coordinate of the vanishing line 330.
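
A sketch of this intersection, assuming each detected lane is summarized by two image points, might be:

```python
import numpy as np

def vanishing_point(lane1, lane2):
    """Each lane is ((x1, y1), (x2, y2)); returns the intersection point."""
    (x1, y1), (x2, y2) = lane1
    (x3, y3), (x4, y4) = lane2
    # Represent each extended lane as a*x + b*y = c and solve by Cramer's rule.
    a1, b1, c1 = y2 - y1, x1 - x2, x1 * y2 - x2 * y1
    a2, b2, c2 = y4 - y3, x3 - x4, x3 * y4 - x4 * y3
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-9:
        return None  # the lanes are parallel in the image; no intersection
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)
```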

FIG. 4 is a view illustrating a configuration of an image processing device 400 according to an embodiment of the disclosure. FIG. 5 is a view illustrating a structure of a neural network according to an embodiment of the disclosure.

Referring to FIGS. 4 and 5, the image processing device 400 according to the embodiment may include a camera sensor 410, an IMU sensor 420, and a neural network 430.

The image processing device 400 may detect objects from a driving image of a vehicle captured by the camera sensor 410. The image processing device 400 tracks the detected objects in operation 415 and may input point coordinates (i_x, i_y) of an image domain of the objects to the neural network 430. In addition, the image processing device 400 obtains vanishing line information vl in the driving image of the vehicle captured by the camera sensor 410 and may input the obtained vanishing line information to the neural network 430. The image processing device 400 may input pitch information p of a vehicle sensed by the IMU sensor 420 to the neural network 430.

The neural network 430 receives the point coordinates (i_x, i_y) of the image domain, the pitch information p of a current vehicle, and the vanishing line information vl, estimates world domain coordinates of the objects based on the input information items, and may output the world domain coordinates (W_x, W_y) corresponding to the point coordinates (i_x, i_y) of the image domain. The neural network 430 may include fully-connected layers as illustrated in FIG. 5.
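
For illustration, a minimal sketch of such a fully-connected network in Python follows; the layer sizes and activation are assumptions, since FIG. 5 does not fix them:

```python
import torch
import torch.nn as nn

class CoordNet(nn.Module):
    """Maps (i_x, i_y, p, vl) to estimated world domain coordinates (W_x, W_y)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden),    # input: image coords, pitch, vanishing line
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),    # output: (W_x, W_y)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = CoordNet()
sample = torch.tensor([[0.41, 0.73, 0.02, 0.50]])  # normalized (i_x, i_y, p, vl)
w_xy = model(sample)                               # estimated (W_x, W_y)
```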

According to an embodiment, the neural network 430 may be trained to distinguish dynamic objects from still objects and to estimate the world domain coordinates of the dynamic objects and the still objects, or the neural network 430 may be trained to estimate the world domain coordinates of the integrated objects without distinguishing the dynamic objects from the still objects.

The neural network 430 may determine information indicating an altitude difference between the vehicle and peripheral objects from the pitch information of the vehicle and the vanishing line information, and may determine the world domain coordinates of the objects.

FIGS. 6A-B and 7A-B are views illustrating scaling normalization according to an embodiment of the disclosure.

The image processing device according to an embodiment may perform the scaling normalization on the image domain coordinates of the objects in the driving image.

As described above, the vanishing line information may include a position of a vanishing point in the driving image. The scaling normalization may be performed in an x axis direction and a y axis direction based on the position of the vanishing point (or a certain region including the vanishing point). During the scaling normalization in the x axis direction, an x coordinate of the vanishing point may be considered. During the scaling normalization in the y axis direction, a y coordinate of the vanishing point may be considered.

According to an embodiment, the image processing device may perform the scaling normalization based on a predetermined result (for example, an average vanishing point position) collectively considering the vanishing line information included in a plurality of frames. Alternatively, the image processing device may obtain the vanishing line information every frame and perform the scaling normalization based on the per-frame vanishing line information.

Hereinafter, a method of performing the scaling normalization in the x axis direction is described with reference to FIGS. 6A and 6B, and a method of performing the scaling normalization in the y axis direction is described with reference to FIGS. 7A and 7B. For example, the driving image illustrated in FIGS. 6A and 7A may have a resolution of 1920 (width)×1200 (height).

In the driving image of FIG. 6A, for a long distance object positioned near the x coordinate of a vanishing point 610, the world distance represented between adjacent pixels on the image domain is very large, and this distance decreases as the distance from the x coordinate of the vanishing point 610 increases. Therefore, in an embodiment, the distance between the pixels on the image domain for the long distance object may be correctly represented by performing normalization in which a scaling ratio of a region close to the x coordinate of the vanishing point 610 is different from a scaling ratio of a region farther away from the x coordinate of the vanishing point 610.

In more detail, as illustrated in FIG. 6A, a vehicle is positioned at about 960 pixels, which is about the middle of the 1920 pixels (the width) along the horizontal axis of the driving image, and the x coordinate of the vanishing point 610 is also positioned at about 960 pixels. When the image domain coordinates are represented to linearly increase regardless of a distance between the long distance object and the vanishing point 610, like in the function 630 illustrated in FIG. 6B, the resolution by which the distance between the long distance object and the vanishing point 610 is represented may be reduced.

Therefore, according to an embodiment, like in the log scale function 650 illustrated in FIG. 6B, a slope value of a scaling factor may increase toward the vanishing point 610 and may be reduced as the distance from the vanishing point 610 increases; accordingly, a region close to the vanishing point 610 and corresponding to a long distance may be represented in detail.

The image processing device according to an embodiment may improve the resolution by which the distance between the long distance object and the vanishing point 610 is represented by performing the scaling normalization on the x coordinate of the image domain of the long distance object in the driving image, for example, in the form of the log scale function 650, based on the vanishing line information in the driving image.

In the driving image of FIG. 7A, based on 600 pixels as a y coordinate of a vanishing point 710, a region above the 600 pixels in the driving image of FIG. 7A corresponds to the sky. Because the sky will be present in the driving image regardless of a road or a lane that significantly affects driving of a vehicle, the sky may not be seriously considered when coordinates of an object are converted. In an embodiment, an operation amount may be reduced by performing the scaling normalization on image domain coordinates of an object included in a region (600 to 1,200 pixels) excluding a region corresponding to 0 to 600 pixels based on the y coordinate of the vanishing point 710.

In the driving image of FIG. 7A, around the 600 pixels as the y coordinate of the vanishing point 710, the world distance represented between adjacent pixels on the image domain is very large, and this distance decreases as the distance from the 600 pixels as the y coordinate of the vanishing point 710 increases. Without considering the above, when the image domain coordinates are represented to linearly increase regardless of the distance between the long distance object and the vanishing point 710, like in the function 730 illustrated in FIG. 7B, the resolution by which the distance between the long distance object and the vanishing point 710 is represented may be reduced.

Therefore, in an embodiment, the resolution by which the distance between the long distance object and the vanishing point 710 is represented may be improved by performing normalization in which a scaling ratio of a region close to the y coordinate of the vanishing point 710 is different from a scaling ratio of a region farther from the y coordinate of the vanishing point 710.
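
A minimal sketch of such a normalization follows; the log1p form and the constants (vanishing point at x=960, vanishing line at y=600, a 1920×1200 image) are assumptions chosen to mirror FIGS. 6B and 7B, not the disclosure's exact function:

```python
import numpy as np

def normalize_x(i_x: float, vp_x: float = 960.0, half_w: float = 960.0) -> float:
    # Signed distance from the vanishing point, scaled to [-1, 1]; the log
    # keeps a high slope near vp_x, so long-distance regions stay detailed.
    d = (i_x - vp_x) / half_w
    return float(np.sign(d) * np.log1p(abs(d) * (np.e - 1.0)))

def normalize_y(i_y: float, vl_y: float = 600.0, img_h: float = 1200.0):
    if i_y < vl_y:
        return None  # above the vanishing line (sky); excluded from processing
    d = (i_y - vl_y) / (img_h - vl_y)    # distance below the vanishing line
    return float(np.log1p(d * (np.e - 1.0)))
```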

FIG. 8 is a view illustrating a conversion of a floating point according to an embodiment of the disclosure.

Referring to FIG. 8, drawings 810, 820, and 830 representing image domain coordinates of an object, which are tracked over the lapse of time, are illustrated.

For example, as illustrated in the drawing 810, image domain coordinates (i_x0, i_y0) of the object in the driving image captured at a point in time at which t=n may be in the form of an integer. When the image processing device normalizes the image domain coordinates, the image domain coordinates in the form of an integer may be discretely represented.

In an embodiment, a type of the image domain coordinates of the object may be converted into the floating point by tracking the image domain coordinates of the object over the lapse of time and filtering the tracked image domain coordinates of the object.

For example, the image processing device tracks image domain coordinates of an object and may filter the image domain coordinates (i_x1, i_y1) of the object, which are tracked at a point in time at which t=n+1, as illustrated in the drawing 820, such that i_x1 ← α×i_x1 + (1−α)×i_x0 and i_y1 ← α×i_y1 + (1−α)×i_y0. In addition, the image processing device filters image domain coordinates (i_x2, i_y2) of an object, which are tracked at a point in time at which t=n+2, as illustrated in the drawing 830, such that i_x2 ← α×i_x2 + (1−α)×i_x1 and i_y2 ← α×i_y2 + (1−α)×i_y1, and may convert the filtered image domain coordinates into floating point coordinates.
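
This filtering is an exponential moving average; a minimal sketch, with α as an assumed smoothing constant, is:

```python
def filter_track(points, alpha: float = 0.7):
    """points: integer (i_x, i_y) detections over time; returns filtered floats."""
    fx, fy = map(float, points[0])
    filtered = [(fx, fy)]
    for ix, iy in points[1:]:
        fx = alpha * ix + (1.0 - alpha) * fx  # i_x,t <- a*i_x,t + (1-a)*i_x,t-1
        fy = alpha * iy + (1.0 - alpha) * fy
        filtered.append((fx, fy))
    return filtered

# Integer detections at t = n, n+1, n+2 become sub-pixel floating points.
print(filter_track([(412, 730), (414, 728), (417, 727)]))
```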

The image processing device according to an embodiment performs the scaling normalization on the image domain coordinates of the object, which are converted into the floating point, based on the vanishing line information in the driving image, and may input the image domain coordinates, on which the scaling normalization is performed, to the neural network.

FIG. 9 is a flowchart illustrating an image processing method according to an embodiment of the disclosure.

Referring to FIG. 9, the image processing device according to an embodiment may detect an object from a driving image of a vehicle in operation S910. The image processing device may determine whether the detected object is a dynamic object with mobility or a still object without mobility in operation S920. The image processing device may determine whether the detected object is the dynamic object or the still object by various machine learning methods or various neural networks.

When it is determined in operation S920 that the object is a dynamic object, the image processing device may generate a live map corresponding to the dynamic object by using a result obtained by a first neural network converting image domain coordinates of the dynamic object into world domain coordinates, in operation S930.

The image processing device may generate a driving parameter of a vehicle by using the live or dynamically updated map in operation S940. The driving parameter may include a driving angle control parameter, an acceleration control parameter, a deceleration control parameter, and/or a turn signal lamp control parameter. The driving parameter generated by the image processing device may be used for preventing a vehicle from colliding with another vehicle.

When it is determined in operation S920 that the object is a still object, the image processing device may generate a landmark map corresponding to the still object by using a result obtained by a second neural network converting image domain coordinates of the still object into world domain coordinates, in operation S950.

The image processing device may determine at least one of a localization and a global path of the vehicle by using the landmark map in operation S960.

Hereinafter, a method, performed by the image processing device, of training the neural network will be described.

The image processing device according to an embodiment may train the neural network (NN) through short-range distance data obtained by a homography operation. In the homography operation, coordinates on another plane are determined by a uniform transformation relationship established among corresponding points when one plane is projected onto the other plane. In general, the homography operation has high reliability at a short range. Therefore, the image processing device may use point coordinates of the 2D image domain matched at a short range on a 3D world domain as initial training data.
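
As an illustration, applying a homography reduces to a projective mapping with a 3×3 matrix; a minimal sketch, assuming the matrix H is already known from calibration, is:

```python
import numpy as np

def image_to_world(H: np.ndarray, i_x: float, i_y: float):
    """H: 3x3 homography from the image plane to the road plane (assumed known)."""
    p = H @ np.array([i_x, i_y, 1.0])  # homogeneous projective mapping
    return p[0] / p[2], p[1] / p[2]    # perspective divide -> (W_x, W_y)
```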

Then, the image processing device may train the neural network while gradually increasing a collection distance of training data. The collection distance of the training data is gradually increased to prevent the image domain coordinates of the 2D driving image (the 2D image domain) and the 3D world domain coordinates from being erroneously matched.

When the training data is accumulated, the image processing device according to an embodiment may correctly collect the training data by providing an identifier (ID) to the dynamic object and/or the still object on the image domain of the 2D driving image, collecting sequential data, and matching the collected data with data collected by a distance sensor.

The image processing device according to an embodiment may also perform auto-calibration of converting the 2D image domain coordinates into the 3D world domain coordinates, reflecting a case in which a position or pose of the camera sensor or the distance sensor is twisted, by accumulating the training data in real time and training the neural network by using the accumulated training data.

FIG. 10 is a flowchart illustrating a method of generating training data based on coordinates of dynamic objects in a driving image, according to an embodiment of the disclosure.

Referring to FIG. 10, a training data generating device (hereinafter referred to as ‘a generating device’) according to an embodiment obtains the image domain coordinates of the dynamic objects by tracking the dynamic objects by analyzing the driving image in operation S1010. An example of a configuration of the training data generating device according to an embodiment will be described in detail with reference to FIG. 11.

According to an embodiment, the training data generating device may track the image domain coordinates of the dynamic objects over the lapse of time. The training data generating device may convert a type of the image domain coordinates of the dynamic objects into the floating point by filtering the tracked image domain coordinates of the dynamic objects. Alternatively, the training data generating device may perform the scaling normalization on the image domain coordinates of the dynamic objects in the driving image based on the vanishing line information in the driving image.

The training data generating device converts image domain coordinates of a first dynamic object positioned within a predetermined matching distance among the dynamic objects into first world domain coordinates in operation S1020. In operation S1020, the training data generating device may convert the image domain coordinates of the first dynamic object into the first world domain coordinates by the homography operation. The matching distance may be, for example, 15 m or 30 m. The training data generating device may provide a first ID to the first dynamic object.

The training data generating device obtains second world domain coordinates of peripheral objects by tracking the peripheral objects by using the distance sensor in operation S1030. The distance sensor may be, for example, a Lidar sensor or a radar sensor. At this time, the training data generating device may provide second IDs to the peripheral objects.

The training data generating device matches one of the peripheral objects with the first dynamic object by comparing the first world domain coordinates with the second world domain coordinates in operation S1040. The training data generating device compares and matches a second ID provided to one of the peripheral objects with the first ID provided to the first dynamic object.

The training data generating device generates the training data including the image domain coordinates of the first dynamic object and the second world domain coordinates of the matched peripheral objects in operation S1050.
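
A minimal sketch of operations S1040 and S1050 follows, assuming nearest-neighbor matching in the world domain (the matching criterion is not fixed by the disclosure) and illustrative coordinate values:

```python
import numpy as np

def make_training_sample(img_xy, first_world_xy, peripherals):
    """peripherals: list of (second_id, (W_x*, W_y*)) from the distance sensor."""
    sec_id, sec_xy = min(
        peripherals,
        key=lambda obj: np.hypot(obj[1][0] - first_world_xy[0],
                                 obj[1][1] - first_world_xy[1]),
    )
    # The training tuple pairs image domain input with distance sensor output.
    return {"matched_id": sec_id, "input": img_xy, "target": sec_xy}

sample = make_training_sample(img_xy=(412.3, 730.1), first_world_xy=(3.1, 14.8),
                              peripherals=[("b", (3.0, 15.0)), ("c", (-2.5, 22.0))])
```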

A method, performed by the training data generating device according to an embodiment, of generating the training data will be described in detail with reference to FIGS. 12A, 12B, and 13.

FIG. 11 is a configuration diagram of a training data generating device 1100 for dynamic objects according to an embodiment of the disclosure.

Referring to FIG. 11, the training data generating device 1100 according to an embodiment may include a camera sensor 1110, a distance sensor 1120, an IMU sensor 1130, and a processor 1170.

The training data generating device 1100 may detect dynamic objects from a driving image of a vehicle captured by the camera sensor 1110. The training data generating device 1100 may obtain the image domain coordinates of the dynamic objects by tracking 1140 the dynamic objects by using the driving image. The image domain coordinates may be in the form of (i_x, i_y). The training data generating device 1100 may obtain vanishing line information vl from the driving image captured by the camera sensor 1110.

The training data generating device 1100 may convert 1145 the image domain coordinates (i_x, i_y) of the first dynamic object positioned within a predetermined matching distance 1160 among the dynamic objects into the first world domain coordinates. As described in detail hereinafter, the training data generating device 1100 may convert the image domain coordinates (i_x, i_y) of the first dynamic object positioned within the predetermined matching distance 1160 among the dynamic objects into the first world domain coordinates by the homography operation. After the initial training of the neural network is completed, the training data generating device 1100 may increase the matching distance 1160 by using the previously trained neural network instead of the homography operation for converting the image domain coordinates.

At this time, the training data generating device 1100 may provide the ID (for example, the first ID) to the first dynamic object.

In addition, the training data generating device 1100 may obtain the second world domain coordinates of the peripheral objects by tracking 1150 the peripheral objects of the vehicle by using the distance sensor 1120. The second world domain coordinates may be in the form of (W_x, W_y). At this time, the distance sensor 1120 may output distances and angles between the vehicle and the peripheral objects. The training data generating device 1100 may provide the ID (for example, the second ID) to the tracked peripheral objects.

The training data generating device 1100 may generate training data 1177 by comparing the first world domain coordinates of the first dynamic object with the second world domain coordinates of the peripheral objects and matching 1173 the first world domain coordinates of the first dynamic object with the second world domain coordinates of the peripheral objects at a short range (for example, within the matching distance).

At this time, the training data generating device 1100 may accumulate the training data by obtaining the image domain coordinates (i_x, i_y) of the first dynamic object and the vanishing line information vl in the driving image from the camera sensor and obtaining the second world domain coordinates (W_x, W_y) of the peripheral objects from the distance sensor 1120. The training data generating device 1100 may obtain the pitch information p of the vehicle by the IMU sensor 1130 and may use the obtained pitch information p for generating the training data 1177.

After initially training the neural network by the training data generated by the homography operation performed on the image domain coordinates (i_x, i_y) of the first dynamic object, the training data generating device 1100 may generate the training data while gradually increasing the matching distance to a long distance. For example, in an embodiment, after the initial training, the previously trained neural network is used instead of the homography operation when the image domain coordinates are converted into the first world domain coordinates, and accordingly, the matching distance may be increased. A method, performed by the training data generating device 1100 according to an embodiment, of increasing the matching distance 1160 and generating the training data will be described in detail with reference to FIGS. 12A and 12B.

FIGS. 12A and 12B are views illustrating a method of generating training data for a dynamic object in a driving image, according to an embodiment of the disclosure.

Referring to FIG. 12A, driving images 1210, 1220, and 1230, in which a vehicle 1205 that is moving on a road is captured, are illustrated. It is assumed that the driving image 1210 is captured at a point in time at which t=0 and the vehicle 1205 is positioned within the matching distance, and the driving image 1220 and the driving image 1230 are respectively captured at points in time at which t=1 and t=2 and the vehicle 1205 is positioned at distances greater than the matching distance.

In addition, referring to FIG. 12B, processes of generating the training data by gradually matching the image domain coordinates (i_x, i_y) of the dynamic object (for example, the vehicle 1205) obtained by the camera sensor with the second world domain coordinates (W_x, W_y) of the peripheral objects obtained by the distance sensor at the points in time t=n, t=n+1, and t=n+2 are illustrated.

For example, assuming that n=0, the training data generating device may convert the image domain coordinates (i_x0, i_y0) of the vehicle 1205 positioned within the matching distance into first world domain coordinates (W_x0, W_y0) by the homography operation at the point in time at which t=0. At this time, the training data generating device may provide an ID a to the vehicle 1205.

The training data generating device may obtain second world domain coordinates (W_x*, W_y*) of the peripheral objects obtained by the distance sensor at the point in time at which t=0. At this time, the training data generating device may provide IDs to the peripheral objects. For example, the training data generating device may provide an ID b to the vehicle 1205 among the peripheral objects obtained by the distance sensor.

The training data generating device compares the second world domain coordinates (W_x*, W_y*) of the peripheral objects with the first world domain coordinates (W_x0, W_y0) and may match the one (for example, a vehicle as a peripheral object having second world domain coordinates (W_x0*, W_y0*)) closest to the first world domain coordinates (W_x0, W_y0) with the vehicle 1205. By performing the matching, the training data generating device determines that the dynamic object with ID=a, tracked by the camera sensor, is the same as the peripheral object with ID=b, tracked by the distance sensor, and may generate the training data (i_x0, i_y0, W_x0*, W_y0*).

At this time, the matching performed at the point in time at which t=0 within the matching distance may also be maintained as ID=a=b at the points in time at which t=1 and t=2, at which the vehicle 1205 is positioned at a distance greater than the matching distance. The training data generating device may generate training data (i_x1, i_y1, W_x1*, W_y1*) by using the image domain coordinates (i_x1, i_y1) tracked by the camera sensor and the second world domain coordinates (W_x1*, W_y1*) of the peripheral object with ID=b tracked by the distance sensor at the point in time at which t=1. The training data generating device may generate training data (i_x2, i_y2, W_x2*, W_y2*) at the point in time at which t=2 by the same method as at the point in time at which t=1.

Although not shown in the drawing, according to an embodiment, the training data generating device may store an object tracking history of the camera sensor and an object tracking history of the distance sensor. In this case, training data at a point in time before the matching is performed may be additionally generated by using the object tracking histories after the matching is performed.

FIG. 13 is a view illustrating a method of accumulatively generating training data, according to an embodiment of the disclosure.

Referring to FIG. 13, training data items accumulatively generated by the training data generating device according to an embodiment using different converters in accordance with the matching distance are illustrated.

The training data generating device according to an embodiment converts image domain coordinates of a first dynamic object positioned within the first matching distance (for example, 12 m) among the dynamic objects tracked in the driving image into first world domain coordinates by a homography operation and may generate training data 0 as initial training data based on the first world domain coordinates. Then, after training the neural network by training data 0, the training data generating device may accumulatively generate training data items 1 and 2 while increasing the matching distance as the number of repetitions gradually increases (for example, once and twice) and converting the image domain coordinates of the first dynamic object by the neural network. As described with reference to FIGS. 12A and 12B, after the matching is performed within the matching distance, the training data may be generated even for an object deviating from the matching distance. Therefore, the neural network is trained to perform the conversion at a greater matching distance than the first matching distance, and a degree of matching correctness at a long range may gradually increase as iterations are performed.
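
This iterative scheme could be sketched as the toy loop below; collect() and train() are illustrative stand-ins (not the disclosure's code), and the initial distance and growth factor are assumed values:

```python
# Toy sketch of FIG. 13: each round, the matching distance grows and the
# converter is replaced by the model trained in the previous round.
def collect(converter, matching_distance):
    return [f"samples within {matching_distance:.1f} m via {converter}"]

def train(data):
    return f"network({data[0]})"

converter, matching_distance = "homography", 12.0  # first matching distance (m)
for _ in range(3):
    data = collect(converter, matching_distance)
    converter = train(data)       # later rounds convert with the network
    matching_distance *= 1.5      # extend matching toward long range
```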

FIG. 14 is a flowchart illustrating a method of generating training data based on coordinates of a still object in a driving image, according to an embodiment of the disclosure.

Referring to FIG. 14, the training data generating device (or a processor of the training data generating device) according to an embodiment stores image domain coordinates of a still object by tracking the still object from the driving image including a plurality of frames over the lapse of time in operation S1410. At this time, the training data generating device may provide a first ID to the tracked still object. According to an embodiment, in operation S1410, the training data generating device may track the image domain coordinates of the still object over the lapse of time. The training data generating device may convert a type of the image domain coordinates of the still object into the floating point by filtering the image domain coordinates of the tracked still object. In addition, the training data generating device may perform the scaling normalization on the image domain coordinates of the still object in the driving image based on the vanishing line information in the driving image.

The training data generating device converts image domain coordinates of a current frame among the image domain coordinates into first global world domain coordinates based on global positioning system (GPS) information in operation S1420.
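
Operation S1420 can be pictured as a planar frame change. The formulation below, which adds the GPS position and rotates by the vehicle heading, is one common realization assumed here for illustration; the disclosure itself only states that GPS information is used.

    import math

    # Convert vehicle-relative world coordinates (W_x, W_y) into global world
    # coordinates, assuming GPS gives a local planar position and a heading.
    def to_global(w_x, w_y, gps_x, gps_y, heading_rad):
        gx = gps_x + w_x * math.cos(heading_rad) - w_y * math.sin(heading_rad)
        gy = gps_y + w_x * math.sin(heading_rad) + w_y * math.cos(heading_rad)
        return gx, gy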

The training data generating device obtains second global world domain coordinates of peripheral objects based on an output of the distance sensor and the GPS information in operation S1430. At this time, the training data generating device may provide second IDs to the peripheral objects. In addition, the training data generating device may accumulatively store the output of the distance sensor and the GPS information.

The training data generating device matches one of the peripheral objects with the still object by comparing the first global world domain coordinates with the second global world domain coordinates in operation S1440. The training data generating device may match the second ID provided to one of the peripheral objects with the first ID provided to the still object.
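
A plausible reading of operation S1440 is a gated nearest-neighbor association in the global frame, sketched below; the gate value is an assumption, as no threshold is given here.

    # Match the still object to the nearest peripheral object in the global
    # frame, but only if it falls within an (assumed) gating distance.
    def match_still_object(first_global, second_globals, gate=1.0):
        """first_global: (x, y); second_globals: {second_id: (x, y)}."""
        best_id, best_d2 = None, gate * gate
        for sid, (x, y) in second_globals.items():
            d2 = (x - first_global[0]) ** 2 + (y - first_global[1]) ** 2
            if d2 < best_d2:
                best_id, best_d2 = sid, d2
        return best_id  # None when nothing falls inside the gate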

The training data generating device generates the training data in operation S1450. At this time, each of the training data items may include the second global world domain coordinates of a peripheral object matched with one of the image domain coordinates stored in operation S1410.

FIG. 15 is a configuration diagram of a training data generating device 1500 for still objects according to an embodiment of the disclosure.

Referring to FIG. 15, the training data generating device 1500 according to an embodiment may include a camera sensor 1510, a distance sensor 1520, a GPS sensor 1530, an IMU sensor 1540, and a processor 1560.

The training data generating device 1500 may capture a driving image including a plurality of frames over the lapse of time by the camera sensor 1510. The training data generating device 1500 tracks 1550 image domain coordinates of a still object from the driving image over the lapse of time and may store the tracked image domain coordinates. At this time, the training data generating device 1500 may provide the ID (for example, the first ID) to the still object.

The training data generating device 1500 may convert a type of the image domain coordinates of the still object into the floating point by filtering the tracked image domain coordinates of the still object. The training data generating device 1500 may perform the scaling normalization on the image domain coordinates of the still object converted into the floating point based on the vanishing line information in the driving image.

The training data generating device 1500 may obtain the vanishing line information vl from the driving image captured by the camera sensor 1510.

The training data generating device 1500 may convert image domain coordinates of a current frame among the image domain coordinates of the still object into the first global world domain coordinates based on the GPS information sensed by the GPS sensor 1530.

The training data generating device 1500 may obtain the second global world domain coordinates of the peripheral objects of the vehicle based on the output (W_(x), W_(y)) of the distance sensor 1520 and the GPS information.

The training data generating device 1500 compares the first global world domain coordinates with the second global world domain coordinates and may match 1563 one of the peripheral objects with the still object. At this time, the training data generating device 1500 may match the second ID provided to the matched peripheral object with the first ID provided to the still object.

The training data generating device 1500 may generate training data 1567 to include the second global world domain coordinates of the peripheral object matched with one of the previously stored image domain coordinates. The training data generating device 1500 obtains the pitch information p of the vehicle through the IMU sensor 1540 and may use the pitch information p for generating the training data 1567.
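
One training record produced by the device 1500 may therefore bundle the image domain coordinates with the altitude-difference cues and the distance-sensor target. The record layout below is illustrative only; the field names are assumptions.

    from typing import NamedTuple

    # Illustrative training record: inputs are the image domain coordinates
    # plus pitch p (IMU sensor 1540) and vanishing line vl (camera image);
    # the target is the matched second global world domain coordinates.
    class Sample(NamedTuple):
        i_x: float
        i_y: float
        pitch: float     # p
        vline: float     # vl
        w_x_star: float  # target x, from the distance sensor 1520
        w_y_star: float  # target y

    def make_sample(img_xy, p, vl, w_star):
        return Sample(img_xy[0], img_xy[1], p, vl, w_star[0], w_star[1])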

A method, performed by the training data generating device 1500 according to an embodiment, of generating the training data by tracking the image domain coordinates of the still object over the lapse of time will be described in detail with reference to FIG. 16.

FIG. 16 is a view illustrating a method of generating training data on a still object in a driving image, according to an embodiment of the disclosure.

Referring to FIG. 16, driving images 1610, 1620, and 1630 of a vehicle 1605 over the lapse of time and drawings 1615, 1625, and 1635 illustrating GPS information on the vehicle 1605 and a peripheral object 1607 to correspond to the driving images 1610, 1620, and 1630 are illustrated. It is assumed that a still object 1603 included in the driving images 1610, 1620, and 1630 and the peripheral object 1607 illustrated in the drawings 1615, 1625, and 1635 are the same object (for example, a street lamp). In addition, it is assumed that the driving image 1610 is captured at a point in time at which t=n and the driving image 1620 and the driving image 1630 are respectively captured at a point in time at which t=n+1 and at a point in time at which t=n+2.

The training data generating device according to an embodiment may store image domain coordinates (i_(x), i_(y)) of the still object 1603 by tracking the still object 1603 from the driving images 1610, 1620, and 1630 over the lapse of time. The training data generating device may accumulatively store the output (W_(x)*, W_(y)*) of the distance sensor (for example, the radar/Lidar sensor) and the GPS information (GPS_(x), GPS_(y)) over the lapse of time.

Because the still object 1603 does not have mobility, its position is fixed. However, when the vehicle 1605 is remote from the still object 1603, the conversion of the image domain coordinates is incorrect and, accordingly, matching is not performed. Therefore, the training data generating device accumulatively stores the coordinates of the still object 1603 captured in accordance with the movement of the vehicle 1605; then, when the vehicle 1605 is close enough to the peripheral object 1607 corresponding to the still object 1603 and matching is accordingly performed, training data on the accumulatively captured coordinates of the still object 1603 may be generated.

In more detail, the training data generating device may obtain, by the camera sensor, the image domain coordinates (i_(x0), i_(y0)) of the still object 1603 in the driving image 1610 at the point in time at which t=n and may store the obtained image domain coordinates (i_(x0), i_(y0)). The training data generating device may provide ID=1 to the still object 1603.

The training data generating device may convert the image domain coordinates (i_(x0), i_(y0)) of the driving image 1610 into the first world domain coordinates (W_(x0), W_(y0)) at the point in time at which t=n. The training data generating device may convert the first world domain coordinates (W_(x0), W_(y0)) into first global world domain coordinates (W_(x0)′, W_(y0)′) based on GPS information (GPS_(x0), GPS_(y0)) obtained by the GPS sensor.

In addition, the training data generating device may obtain second global world domain coordinates (W_(x0)*, W_(y0)*) of the peripheral object 1607 based on the output (W_(x0)*, W_(y0)*) of the distance sensor and the GPS information (GPS_(x0), GPS_(y0)) at the point in time at which t=n.

The training data generating device may convert the image domain coordinates (i_(x1), i_(y1)) and (i_(x2), i_(y2)) of the driving images 1620 and 1630 into first global world domain coordinates (W_(x1)′, W_(y1)′) and (W_(x2)′, W_(y2)′) at the point in time at which t=n+1 and at the point in time at which t=n+2 by the same method as at the point in time at which t=n.

In addition, the training data generating device may obtain second global world domain coordinates ((W_(x1)*, W_(y1)*), (W_(x2)*, W_(y2)*)) of the peripheral object 1607 based on an output ((W_(x1)*, W_(y1)*), (W_(x2)*, W_(y2)*)) of the distance sensor and GPS information ((GPS_(x1), GPS_(y1)), (GPS_(x2), GPS_(y2))), like in the drawings 1625 and 1635, at the point in time at which t=n+1 and at the point in time at which t=n+2 by the same method as at the point in time at which t=n. The training data generating device may accumulatively store the image domain coordinates that are an output of the camera sensor, the outputs of the distance sensor, the GPS information, and the second global world domain coordinates of the peripheral object 1607.

In the drawing 1615 corresponding to the point in time at which t=n, a difference (an error) is generated between the first global world domain coordinates (W_(x0)′, W_(y0)′) of the still object 1603 obtained by the camera sensor and the second global world domain coordinates (W_(x0)*, W_(y0)*) of the peripheral object 1607 obtained by the distance sensor. The difference (the error) is reduced as the distance between the vehicle 1605 and the peripheral object 1607 is gradually reduced over the lapse of time, and may be removed, or substantially removed such that the error is negligible, at the point in time at which t=n+2.

The training data generating device compares the first global world domain coordinates of the still object 1603 with the second global world domain coordinates at the point in time at which t=n+2, at which the vehicle 1605 is closest to the peripheral object 1607 (or the vehicle 1605 moves past the still object 1603), and may match the still object 1603 with the peripheral object 1607. The training data generating device may match the still object 1603 with ID=1 with the peripheral object 1607. Through the matching, the training data generating device may generate a training data set ((i_(x0), i_(y0), W_(x2)*, W_(y2)*), (i_(x1), i_(y1), W_(x2)*, W_(y2)*), (i_(x2), i_(y2), W_(x2)*, W_(y2)*)) for the still object 1603 with ID=1 at once.
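
The batch emission at the matching point in time reduces to a short sketch (variable names are illustrative): every accumulated image domain observation of the still object is paired with the distance sensor coordinates from the closest-range frame, where the error is smallest.

    # Pair all accumulated image observations of the still object with the
    # distance sensor coordinates of the matching (closest-range) frame.
    def emit_still_object_set(image_track, w_star_closest):
        """image_track: [(i_x, i_y), ...] over t = n, n+1, n+2;
           w_star_closest: (W_x*, W_y*) at the matching frame t = n+2."""
        wx, wy = w_star_closest
        return [(ix, iy, wx, wy) for ix, iy in image_track]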

FIG. 17 is a block diagram of an image processing device 1700 according to an embodiment of the disclosure.

Referring to FIG. 17, the image processing device 1700 according to an embodiment includes a processor 1730. The image processing device 1700 may further include sensors 1710, a memory 1750, a communication interface 1770, and a display 1790. The sensors 1710, the processor 1730, the memory 1750, the communication interface 1770, and the display 1790 may communicate with each other through a communication bus 1705.

The sensors 1710 may include, for example, the camera sensor, an image sensor, a vision sensor, the IMU sensor, the gyro sensor, an acceleration sensor, the GPS sensor, a terrestrial magnetic sensor, the Lidar sensor, the radar sensor, and an altitude measurement sensor. However, an embodiment of the disclosure is not limited thereto. The camera sensor, the image sensor, and/or the vision sensor may be mounted in a vehicle and may capture a driving image of the vehicle. The IMU sensor, the gyro sensor, and/or the altitude measurement sensor may sense the pitch information of the vehicle. The Lidar sensor and/or the radar sensor may sense (local) world domain coordinates of an object. The GPS sensor may sense global world domain coordinates of the vehicle.

The processor 1730 may perform at least one of the methods described above with reference to FIGS. 1 to 16 or an algorithm corresponding to the at least one method. That is, the various blocks illustrated in the figures may be implemented as hardware or as software executed under the control of the processor 1730. The processor 1730 may execute a program representative of the various blocks illustrated in the figures and may control the image processing device 1700. The program code executed by the processor 1730 may be stored in the memory 1750.

The processor 1730 may be formed of, for example, a central processing unit (CPU) or a graphics processing unit (GPU).

The memory 1750 may store information on a driving image and on an altitude difference between the vehicle and the object. In addition, the memory 1750 may store image domain coordinates of the object tracked over the lapse of time. In addition, the memory 1750 may store a live map generated by the processor 1730 to correspond to the dynamic object and/or a landmark map generated to correspond to the still object.

The memory 1750 may store world domain coordinates of the object determined by the processor 1730.

World domain coordinates of the still object stored in the memory 1750 may be read from the memory 1750 for rapidly grasping information that does not change, such as a crosswalk, a sign, a lane, and the surrounding terrain, when the vehicle passes by the same area. Considering that the vehicle moves along the same route when it is used for a commute, by using the information previously stored in the memory 1750, it is possible to improve the image processing speed for determining the world domain coordinates of the still object and to reduce the processing load. The memory 1750 may be a volatile memory or a non-volatile memory.
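
The reuse described above might be organized, for example, as a cache keyed by a coarse GPS cell; the cell size and key scheme below are assumptions for illustration.

    # Cache still-object world domain coordinates by a coarse GPS cell so a
    # repeated route (for example, a commute) can skip recomputation.
    def cell_key(gps_x, gps_y, cell=50.0):
        return (int(gps_x // cell), int(gps_y // cell))

    class LandmarkCache:
        def __init__(self):
            self._store = {}  # cell -> list of (W_x, W_y) landmark coordinates

        def put(self, gps_x, gps_y, coords):
            self._store.setdefault(cell_key(gps_x, gps_y), []).append(coords)

        def get(self, gps_x, gps_y):
            return self._store.get(cell_key(gps_x, gps_y), [])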

The communication interface 1770 may receive the driving image captured by a source outside of the image processing device 1700, such as a traffic camera or a camera mounted to another vehicle or structure, information of various sensors outside of the image processing device 1700, and map information. According to an embodiment, the communication interface 1770 may transmit the world domain coordinates of the object determined by the processor 1730 to the outside of the image processing device 1700 or to the display 1790.

The display 1790 may display the world domain coordinates of the object together with the driving image or may additionally display the world domain coordinates of the object. The display 1790 may display the world domain coordinates of the object as, for example, map information, a position of the object in a navigation image, or the world domain coordinates themselves. For example, when the image processing device 1700 is embedded in the vehicle, the display 1790 may be formed of a head-up display (HUD) provided in the vehicle.

The embodiments of the disclosure may be implemented by a hardware component, a software component, and/or a combination of a hardware component and a software component. For example, the devices, methods, and components described in the embodiments may be implemented by using one or more general-purpose computers or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to an instruction. The processing device may run an operating system (OS) and one or more software applications performed on the OS. In addition, the processing device may access, store, manipulate, process, and generate data in response to execution of software. For convenience, it is described that only one processing device is used. However, those skilled in the art will understand that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors or a processor and a controller. In addition, another processing configuration, such as a parallel processor, is available.

The software may include a computer program, code, an instruction, or one or more combinations of the computer program, the code, and the instruction and may configure the processing device so as to operate as desired or may independently or collectively instruct the processing device. The software and/or the data may be permanently or temporarily embodied in a certain type of machine, a component, a physical device, virtual equipment, a computer storage medium or device, or a transmitted signal wave in order to be interpreted by the processing device or to provide the instruction or the data to the processing device. The software may be distributed over computer systems connected by a network and may be stored or executed in a distributed manner. The software and the data may be stored in one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of a program instruction that may be performed by various computer units and may be recorded in a computer-readable recording medium. The computer-readable recording medium may include a program instruction, a data file, and a data structure, or a combination of the program instruction, the data file, and the data structure. The program instruction recorded in the computer-readable recording medium may be specially designed and configured for the embodiment or may be well known to software engineers. The computer-readable recording medium may be, for example, a magnetic medium such as a hard disc, a floppy disc, or a magnetic tape, an optical medium such as a compact disc read only memory (CD-ROM) or a digital versatile disc (DVD), a magneto-optical medium such as a floptical disc, or a hardware device specially configured to store and perform the program instruction, such as a ROM, random access memory (RAM), or flash memory. The program instruction may include high-level language code that may be executed by a computer by using an interpreter, as well as machine language code created by a compiler. The hardware device may be configured to operate as one or more software modules in order to perform the operation of the embodiment, and the reverse is also available.

While embodiments of the disclosure have been particularly shown and described, it will be understood that various changes in form and details may be made therein without departing from the spirit and scope of the following claims.

1. An image processing method for controlling a vehicle on a road, the image processing method comprising: detecting an object within a driving image; obtaining an altitude difference between the vehicle and the object; determining world domain coordinates of the object by a neural network processing of image domain coordinates of the object in the driving image and the altitude difference; and controlling the vehicle on the road with respect to the object based on the world domain coordinates of the object.

2. The image processing method of claim 1, wherein the altitude difference comprises: pitch information of the vehicle; and vanishing line information in the driving image.

3. The image processing method of claim 2, further comprising performing scaling normalization on the image domain coordinates of the object in the driving image based on the vanishing line information in the driving image.

4. The image processing method of claim 1, further comprising: tracking the image domain coordinates of the object over lapse of time; and filtering the image domain coordinates of the object and converting a type of the image domain coordinates of the object into a floating point.

5. The image processing method of claim 1, wherein the object is either a dynamic object with mobility or a still object without mobility, and wherein the neural network comprises: a first neural network for estimating world domain coordinates of the dynamic object; and a second neural network for estimating world domain coordinates of the still object.

6. The image processing method of claim 1, wherein the object is a dynamic object with mobility, and wherein the method further comprises: generating a live map corresponding to the dynamic object by using a result of converting image domain coordinates of the dynamic object into world domain coordinates; and generating a driving parameter of the vehicle for controlling the vehicle on the road with respect to the dynamic object by using the live map.

7. The image processing method of claim 1, wherein the object is a still object without mobility, and wherein the method further comprises: generating a landmark map corresponding to the still object by using a result of converting image domain coordinates of the still object into world domain coordinates; and determining at least one of a position and a route of the vehicle for controlling the vehicle on the road with respect to the still object by using the landmark map.

8. The image processing method of claim 1, further comprising outputting the world domain coordinates of the object.

9. The image processing method of claim 1, further comprising obtaining the driving image captured by a camera mounted in the vehicle.

10. A training data generating method comprising: obtaining image domain coordinates of dynamic objects by tracking the dynamic objects within a driving image; converting image domain coordinates of a first dynamic object among the dynamic objects into first world domain coordinates of the first dynamic object, wherein the first dynamic object is positioned within a predetermined matching distance from a vehicle; obtaining second world domain coordinates of peripheral objects by tracking the peripheral objects by using a distance sensor; matching one of the peripheral objects with the first dynamic object by comparing the first world domain coordinates and the second world domain coordinates; and generating training data including the image domain coordinates of the first dynamic object and the second world domain coordinates of the matched peripheral object.

11. The training data generating method of claim 10, wherein the converting of the image domain coordinates comprises converting initial image domain coordinates of the first dynamic object into the first world domain coordinates by a homography operation.

12. The training data generating method of claim 10, further comprising: associating a first identifier (ID) with the first dynamic object; and associating second IDs with the peripheral objects, wherein the matching of one of the peripheral objects with the first dynamic object comprises matching a second ID among the second IDs associated with one of the peripheral objects with the first ID associated with the first dynamic object.

13. The training data generating method of claim 10, wherein the dynamic objects comprise at least one of peripheral vehicles, pedestrians, and animals.

14. The training data generating method of claim 10, further comprising: tracking the image domain coordinates of the dynamic objects over lapse of time; and converting a type of the image domain coordinates of the dynamic objects into a floating point by filtering the tracked image domain coordinates of the dynamic objects.

15. The training data generating method of claim 10, further comprising performing scaling normalization on the image domain coordinates of the dynamic objects in the driving image based on vanishing line information in the driving image.

16. A training data generating method comprising: storing image domain coordinates of a still object by tracking the still object from a driving image including a plurality of frames over lapse of time; converting image domain coordinates of a current frame among the image domain coordinates into first global world domain coordinates based on global positioning system (GPS) information; obtaining second global world domain coordinates of peripheral objects based on an output of a distance sensor and the GPS information; matching one of the peripheral objects with the still object by comparing the first global world domain coordinates with the second global world domain coordinates; and generating a plurality of training data, wherein each training data of the plurality of training data includes one of the stored image domain coordinates and the second global world domain coordinates of the matched peripheral object.

17. The training data generating method of claim 16, further comprising: associating a first ID with the still object; and associating second IDs with the peripheral objects, wherein the matching of one of the peripheral objects with the still object comprises matching a second ID among the second IDs associated with one of the peripheral objects with the first ID associated with the still object.

18. The training data generating method of claim 16, wherein the still object comprises at least one of buildings, signs, traffic lights, a crosswalk, a stop line, and a driving line included in the driving image.

19. The training data generating method of claim 16, further comprising: tracking the image domain coordinates of the still object over lapse of time; and converting a type of the image domain coordinates of the still object into a floating point by filtering the tracked image domain coordinates of the still object.

20. The training data generating method of claim 16, further comprising performing scaling normalization on the image domain coordinates of the still object in the driving image based on vanishing line information in the driving image.

21-32. (canceled)